
* merge census and voting data with account numbers

**********************************************************************

* merge census data with account numbers
use Data/census_SA1_match.dta, clear
	
* fill in missing SA1s (update only replaces missing values in the master)
merge 1:1 account_number using Data/missing_SA1_matched.dta, update
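	* Optional check (a sketch, not part of the original workflow): tabulate
	* _merge to see what the update did. With the update option, _merge == 4
	* marks missing master values filled from the using file and _merge == 5
	* marks nonmissing conflicts, where the master value is kept.
	tabulate _merge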

	keep SA1 account_number

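* fill in further SA1s from SA1_match.dta, which is keyed on customer_number
* (update replace: nonmissing values in the using file overwrite the master)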
rename account_number customer_number	
	
merge 1:1 customer_number using Data/SA1_match.dta, update replace	
	
rename customer_number account_number	
drop _merge	
merge m:1 SA1 using Data/Census11_SA1.dta
	drop if _merge ==2
	drop _merge

	sort account_number
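	* Optional check (a sketch): the 1:1 merges above imply one row per
	* account_number, so isid should pass here; it stops the do-file with an
	* error if duplicates have crept in. missok tolerates a missing identifier.
	isid account_number, missok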

save Data/CustomerCensusSA1.dta, replace


***********
	* merge voting results with SA1 match file
	
use OrigData/Census2011/SA1/ElectoralDivisions_SA1.dta, clear


merge m:1 electoral_division using OrigData/VotingData/Voting2013_Division.dta	
	
	drop if _merge==1	
	drop _merge

	destring SA1, replace

merge 1:m SA1 using Data/CustomerCensusSA1.dta

	drop if _merge ==1
	drop _merge

	drop DivisionID 

save Data/CustomerCensusVoteSA1.dta, replace

*****************************************************************
	
** match households to polling places for disaggregated voting data

use OrigData/VotingData/SA1_attr.dta, clear

	keep if state_name_2011 =="Victoria"

	keep sa1_7digitcode_2011 x y 

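	* spell out the full variable name rather than relying on abbreviation
	rename sa1_7digitcode_2011 SA1
	* later steps refer to the SA1 coordinates as xSA1 and ySA1
	* (assumed here: x is longitude, y is latitude, as passed to geodist below)
	rename x xSA1
	rename y ySA1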

merge 1:m SA1 using OrigData/Census2011/SA1/ElectoralDivisions_SA1.dta
	drop _merge
save Data/SA1vote_coords.dta, replace

*************************
** import polling place locations file (has CCD identifier as well as latitude and longitude)

clear
insheet using "OrigData/VotingData/PollingPlace_locations.csv", comma

	keep if stateco==3

	keep div* ppname premises address* locality postcode ppid latt longi ccd status

	rename ppid PPId
	sort PPId

save Data/Polling_locations.dta, replace

** merge with voting data at polling places

merge m:1 PPId using OrigData/VotingData/Voting2013_PollingPlace.dta
	drop if _merge ~=3
	drop _merge

	drop if status =="Abolition"
	joinby electoral_division using Data/SA1vote_coords.dta

** match the coordinates of each SA1 (xSA1, ySA1) to the closest polling place (latt, longi)
** using geodist (user-written; ssc install geodist); matching within electoral division is quicker

	geodist ySA1 xSA1 latt longi, gen(dist) 
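
	* Optional check (a sketch): inspect the computed distances before keeping
	* the nearest polling place. geodist reports kilometres unless the miles
	* option is given, so very large values would suggest swapped or missing
	* coordinates.
	summarize dist, detail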

	* keep the closest polling place for each SA1
	sort SA1 dist
collapse (first) ppname locality postcode PPId latt longi PollingPlace P_pollingGRN P_pollingNP P_pollingLP P_pollingALP green_state DivisionID electoral_division xSA1 ySA1 dist, by(SA1)

save Data/Polling_SA1_match.dta, replace

use Data/Polling_SA1_match.dta, clear

merge 1:m SA1 using Data/CustomerCensusVoteSA1.dta

	drop if _merge==1

	drop _merge
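
	* Optional check (a sketch): count customer records that did not pick up a
	* polling place in the merge above. missing() works for both numeric and
	* string variables.
	count if missing(PPId)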



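	* set missing party vote shares to zero where a member is recorded
	* (assumes member is nonempty only where voting data were matched)
	foreach var in ALP LP NP GRN {
		replace P`var' = 0 if P`var' == . & member ~= ""
		replace P_polling`var' = 0 if P_polling`var' == . & member ~= ""
	}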



save Data/CustomerCensusVoteSA1.dta, replace



